Unsupervised versus supervised training of acoustic models
Abstract
In this paper we report unsupervised training experiments conducted on large amounts of English Fisher conversational telephone speech. A great deal of work has been reported on unsupervised training; the major difference in this work is that we compare the behavior of unsupervised training with that of supervised training on exactly the same data. This comparison reveals surprising results. First, as the amount of training data increases, unsupervised training, even when bootstrapped with a very limited amount (1 hour) of manually transcribed data, improves recognition performance faster than supervised training does, and it converges to the performance of supervised training. Second, bootstrapping unsupervised training with more manually transcribed data brings no significant benefit when a large amount of untranscribed data is available.
Similar resources
Unsupervised Testing Strategies for ASR
This paper describes unsupervised strategies for estimating relative accuracy differences between acoustic models or language models used for automatic speech recognition. To test acoustic models, the approach extends ideas used for unsupervised discriminative training to include a more explicit validation on held out data. To test language models, we use a dual interpretation of the same proce...
Lightly supervised and unsupervised acoustic model training
The last decade has witnessed substantial progress in speech recognition technology, with today's state-of-the-art systems being able to transcribe unrestricted broadcast news audio data with a word error rate of about 20%. However, acoustic model development for these recognizers relies on the availability of large amounts of manually transcribed training data. Obtaining such data is both time-consu...
Speech alignment and recognition experiments for Luxembourgish
Luxembourgish, embedded in a multilingual context on the divide between Romance and Germanic cultures, remains one of Europe’s under-described languages. In this paper, we propose to study acoustic similarities between Luxembourgish and major contact languages (German, French, English) with the help of automatic speech alignment and recognition systems. Experiments were run using monolingual ac...
Unsupervised adaptation for acoustic language identification
Our system for automatic language identification (LID) of spoken utterances uses language-dependent parallel phoneme recognition (PPR) with Hidden Markov Model (HMM) phoneme recognizers and optional phoneme language models (LMs). Such an LID system for continuous speech requires many hours of orthographically transcribed data for training of language-dependent HMMs and LMs as well ...
Unsupervised Submodular Subset Selection for Speech Data: Extended Version
We conduct a comparative study on selecting subsets of acoustic data for training phone recognizers. The data selection problem is approached as a constrained submodular optimization problem. Previous applications of this approach required transcriptions or acoustic models trained in a supervised way. In this paper we develop and evaluate a novel and entirely unsupervised approach, and apply it...
Publication date: 2008